Abstract

We study a Bayesian approach to nonparametric estimation of the pe- 
riodic drift function of a one-dimensional diffusion from continuous-time 
data. We rewrite the likelihood in terms of Riemann integrals, by in- 
troducing the local time of the process, and specify a centered Gaussian 
prior on the drift with a precision operator that is of differential form. 
It is proved that this is a conjugate prior for the likelihood and hence 
that the posterior is also Gaussian. We give an explicit expression for the 
posterior precision operator, also of differential form, and show that the 
posterior mean is the solution of a differential equation requiring inversion 
of the posterior precision for its solution. Moreover, we bound the rate 
at which the posterior contracts around the true drift function. Our for- 
mulation of the estimation problem leads to algorithms which are readily 
implementable and analyzed using ideas from the numerical analysis of 
differential equations. The central results proved here require tools from 
the analysis of differential equations, together with new functional limit 
theorems for the local time of diffusions on the circle. 

1 Introduction 

Diffusion processes are routinely used as statistical models for a large variety of 
phenomena including molecular dynamics, econometrics and climate dynamics 
(see for instance ^ and [19]). Such a process can be specified via the drift
and diffusion functions of a stochastic differential equation driven by a Brownian 
motion W. Even in one dimension, this class of processes attracts great applied 
interest. In this case, provided the diffusion function $\sigma$ is known and under mild
additional assumptions, one can transform the process such that the diffusion
function is constant:

$$dX_t = b(X_t)\,dt + dW_t. \tag{1.1}$$

This is the form we consider here. 

We are interested in the statistical problem of recovering the drift function 
b given an observed path of the diffusion, $(X_t : t \in [0,T])$, which is a solution
of (1.1). Whenever application-driven insight into the form of the drift b is
available, one can attempt to exploit this by postulating a parametric model
for b, indexed by some finite-dimensional parameter $\theta \in \Theta \subset \mathbb{R}^k$. The statistical
problem then reduces to estimating the parameter $\theta$, see e.g. [35] for an
overview of this well-researched area. In other cases however, one has to resort
to nonparametric methods for making inference on the function b. Several such
methods have been proposed in the literature. An incomplete list includes kernel
methods (e.g. [5], [12], [38]), penalized likelihood methods (e.g. [5]), and spectral
approaches [2]. 

In this paper we investigate recently developed Bayesian methodology for es- 
timating the drift function of a diffusion based on continuous-time observations 
$X^T = (X_t : t \in [0,T])$. We consider a periodic setup, which essentially means
that we observe a diffusion on the circle. This is motivated by applications 
in which the data consist of a recording of angles. In molecular dynamics for 
instance, certain angles in molecules are typical chemically relevant coordinates 
and can be modeled by diffusions, see [13] and [14]. We will consider Gaus- 
sian prior measures for the periodic drift function b whose inverse covariance 
operators are chosen from a family of even order differential operators. Recent 
applied work has shown that this is computationally attractive, since numerical 
methods for differential equations can be used for posterior sampling. Two illustrative
applications of our inference framework and algorithms to accommodate
both continuously and discretely observed data are detailed in

In Section 2 we precisely state the inference problem of interest, and describe
the properties of the family of Gaussian priors that we adopt. We postulate a
prior precision operator of the form

$$C_0^{-1} = \eta\big((-\Delta)^p + \kappa I\big),$$

where $\Delta$ is the one-dimensional Laplacian, $p$ is an integer and $\eta$, $\kappa$ are real and
positive hyperparameters. Working with prior precision operators has numerous 
computational advantages and a central goal of this work is to develop statistical 
tools of analysis, in particular for posterior consistency studies, which are
well-adapted to this setting. The work of [1] developed tools of analysis which do
this in the context of linear inverse problems with small observational noise, and 
we adapt the techniques developed there to our setting. 

An appealing aspect of choosing a Gaussian prior on the drift function b 
is conjugacy, in the sense that the posterior is Gaussian as well. Since the 
log-likelihood is quadratic in b (Girsanov's theorem) this is not unexpected. 
Formally the posterior can be computed by "completing the square". We note
however that for our model, if b is distributed according to a Gaussian prior $\Pi$
and given b, the data are generated by (1.1), the joint distribution of b and
$X^T$ is obviously not Gaussian in general. As a result, deriving Gaussianity of
the posterior in this infinite-dimensional setting is not straightforward. Section
3 is devoted to showing that, for the priors that we consider, the posterior, i.e.
the conditional distribution of b given $X^T$, is indeed Gaussian. After a formal
derivation of the associated posterior in Subsection 3.2 we rigorously prove in
Theorem 3.2 below that the associated posterior is Gaussian and obtain the
posterior mean and covariance structure. The posterior precision operator is 
again a differential operator, involving the local time of the diffusion, and the 
posterior mean is characterized as the unique weak solution of a 2p-th order 
differential equation. In Subsection 3.3 we outline how our Bayesian approach
with Gaussian prior can be viewed as a penalized least-squares estimator, where
the pth order Sobolev norm of b is penalized and the hyperparameters $\eta$ and
$\kappa$ quantify the degree of penalization. In the inverse problem literature this
connection is known as Tikhonov regularisation.

In Bayesian nonparametrics it is well known that careless constructions of 
priors can lead to inconsistent procedures and sub-optimal convergence rates 
(e.g. [10], [7]). Consistency or rate of convergence results are often obtained
using general results that are available for various types of statistical models
and that give sufficient conditions in terms of metric entropy and prior mass
assumptions. See, for instance, [17], [13], [18], [33], and the references therein. In
this paper however we use the explicit description of the posterior distribution, 
which allows us to take a rather direct approach to studying the asymptotic 
behavior of our procedure. In particular, we avoid entropy or prior mass con- 
siderations. 

Since the posterior involves a periodic version of the local time of the process 
X, the asymptotic properties of the local time play a key role in this investiga- 
tion. In the present setting the existing asymptotic theory for the local time of 
ergodic diffusions (cf. e.g. [35], [33]) can not be used however, since we do not 
assume ergodicity but instead rely on the periodicity of the drift function b to 
accumulate information as $T \to \infty$. As a consequence, the existing posterior
rate of convergence results for ergodic diffusion models of [28] do not apply. In
Section 4 we therefore present new limit theorems for the local time of diffusions
on the circle. These can be seen as extending and complementing the work of
Bolthausen [5], who proved a uniform central limit theorem for the local time
of Brownian motion on the circle (the case $b = 0$ in (1.1)). For our purposes
we need asymptotic tightness of the properly normalized local time in certain
Sobolev spaces however, and we need the result not just for Brownian motion,
but for general periodic, zero-mean drift functions b.

Having these technical tools in place we use them in combination with methods
from the analysis of differential equations in Section 5 to obtain a rate of
contraction result for the posterior distribution. The result states that when
the true drift function b is periodic and p-regular in the Sobolev sense, then
the posterior contracts around b at a rate that is essentially $T^{-(p-1/2)/(2p)}$ as
$T \to \infty$ (with respect to the $L^2$-norm). In particular, we have posterior consistency.
Although lower bounds for the rate of convergence in the exact model
under study do not appear to be known, comparison with similar models suggests
that the optimal rate for estimating a drift function $b_0$ that is $\beta$-regular
(in Sobolev sense) may be $T^{-\beta/(1+2\beta)}$ in our setting (in a minimax sense over
Sobolev balls for instance, cf. e.g. [22], [31] for similar results). The general
message from the Gaussian process prior literature is that this optimal rate is
typically attained if the "regularity" $\alpha$ of the prior matches the regularity $\beta$
of the function that is being estimated (see [37]). As discussed in Section 2.2,
the regularity of the prior we employ in this paper is essentially $\alpha = p - 1/2$.
This suggests that in principle, it should be possible to relax our assumption
that $b_0$ is p-regular to the assumption that $b_0$ is $(p - 1/2)$-regular, while still
maintaining the same rate $T^{-(p-1/2)/(2p)}$. It is however not clear whether this
can be achieved by adapting the proof we give in this paper. The method of
proof is adapted from [1] where it is used to study linear inverse problems in the
small noise limit. In that context the proof gives sharp rates in some parameter
regimes, but not in others.
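
As a small arithmetic check of the rates quoted above, the following sketch evaluates the conjectured optimal exponent $\beta/(1+2\beta)$ at $\beta = p - 1/2$ and confirms that it equals $(p - 1/2)/(2p)$. The function names are ours and purely illustrative; exact rational arithmetic is used so that the comparison is not affected by rounding.

```python
from fractions import Fraction


def minimax_exponent(beta):
    """Exponent r in the conjectured optimal rate T^(-r), r = beta / (1 + 2 beta)."""
    return beta / (1 + 2 * beta)


def prior_matched_exponent(p):
    """Exponent obtained when the truth has regularity beta = p - 1/2."""
    return minimax_exponent(Fraction(2 * p - 1, 2))


for p in (2, 3, 4):
    # Both expressions agree: (p - 1/2) / (2p) = (2p - 1) / (4p).
    assert prior_matched_exponent(p) == Fraction(2 * p - 1, 4 * p)
    print(p, prior_matched_exponent(p))
```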

There are a number of future directions that this work could be taken in. 
First of all, alternative technical approaches could be explored to derive sharp 
convergence rates. One approach could be to use the representation of the posterior
mean as a minimizer of some stochastic objective functional (cf. Section 3.3)
and use empirical process-type techniques to study its asymptotic properties. 
This however requires technical tools (e.g. uniform limit theorems, maximal in- 
equalities) that are presently not available in this setting of periodic diffusions. 
Alternatively, sharp rates may result from a general rate of convergence the- 
ory for posteriors in the spirit of [33], if that could be developed for this class 
of models. Secondly, motivated by practical considerations, it will also be in- 
teresting to determine whether useful adaptive procedures can be constructed 
by choosing the hyperparameters p, $\eta$ and $\kappa$ in a data-driven way, for instance
by hierarchical Bayes or empirical Bayes procedures. A third future direction 
concerns extension of the ideas in this paper to diffusions in more than one 
dimension. The local time is, then, a much more singular object and developing 
an analysis of posterior consistency will present new challenges.

2 Observation model and prior distribution 

In this section we first introduce the diffusion process under study, fixing notation
and describing how we exploit periodicity; see Subsection 2.1. In Subsection
2.2 we introduce the prior we place on the drift function of the diffusion, specifying
the prior precision operator and collecting basic properties.

2.1 The diffusion 

Consider the stochastic differential equation (SDE) 

$$dX_t = b(X_t)\,dt + dW_t, \qquad X_0 = 0, \tag{2.1}$$

where W is standard Brownian motion and $b : \mathbb{R} \to \mathbb{R}$ is a continuously
differentiable, 1-periodic drift function with zero mean, i.e. $b(x + k) = b(x)$ for all
$x \in \mathbb{R}$ and $k \in \mathbb{Z}$ and $\int_0^1 b(x)\,dx = 0$. We let $\mathbb{T}$ denote the circle $[0, 1)$ so that
we can also write $b : \mathbb{T} \to \mathbb{R}$ and we summarize the assumptions on b by writing
$b \in \dot C^1(\mathbb{T})$, the dot denoting mean zero.

For every $b \in \dot C^1(\mathbb{T})$ the SDE (2.1) has a unique weak solution (see e.g.
Theorems 6.1.6 and 6.2.1 in [11], p. 214). For $T > 0$, we denote the law that
this solution generates on the canonical path space $C[0, T]$ by $P^T_b$. In particular
$P^T_0$ is the Wiener measure on $C[0,T]$. By Girsanov's theorem the laws $P^T_b$,
$b \in \dot C^1(\mathbb{T})$, are all equivalent on $C[0,T]$. If two measurable maps of X are
almost surely (a.s.) equal under some $P^T_b$ they are therefore a.s. equal under
any of the laws $P^T_b$, and we will simply write that they are equal a.s.

Denoting the canonical process on $C[0, T]$ by X, the Radon-Nikodym derivative
of $P^T_b$ relative to the Wiener measure satisfies

$$\frac{dP^T_b}{dP^T_0}(X) = \exp\Big(-\frac12 \int_0^T b^2(X_t)\,dt + \int_0^T b(X_t)\,dX_t\Big)$$

almost surely, by Girsanov's theorem (e.g. [23]). Observe that by Itô's formula
the likelihood can be rewritten as

$$\frac{dP^T_b}{dP^T_0}(X) = \exp\big(-\Phi_T(b; X)\big)$$

a.s., where

$$\Phi_T(b; X) = \frac12 \int_0^T \big(b^2(X_t) + b'(X_t)\big)\,dt + B(X_0) - B(X_T) \tag{2.2}$$

and $B' = b$. Note that B is also 1-periodic, since b has average zero.

It will be convenient to write the integrals in the expression for $\Phi_T$ in terms
of the local time of the process X. Let $(L_t(x; X) : t \ge 0, x \in \mathbb{R})$ be the
semimartingale local time of X, so that

$$\int_{-\infty}^{\infty} f(x) L_T(x; X)\,dx = \int_0^T f(X_s)\,ds \tag{2.3}$$

holds a.s. for any bounded, measurable $f : \mathbb{R} \to \mathbb{R}$. Defining also the random
variables $\chi_T(x; X)$ by

$$\chi_T(x; X) = \begin{cases} 1 & \text{if } X_0 < x < X_T, \\ -1 & \text{if } X_T < x < X_0, \\ 0 & \text{otherwise,} \end{cases}$$

we may then write

$$\Phi_T(b; X) = \frac12 \int_{\mathbb{R}} \Big(L_T(x; X)\big(b^2(x) + b'(x)\big) - 2\chi_T(x; X) b(x)\Big)\,dx. \tag{2.4}$$






In view of the periodicity of the functions involved it is sensible to introduce
a periodic version $L^\circ$ of the local time L by defining

$$L^\circ_T(x; X) = \sum_{k \in \mathbb{Z}} L_T(x + k; X)$$

for $x \in \mathbb{T}$. Note that for every $T > 0$, only finitely many terms in the sum are
almost surely non-zero, so the sum is well defined. It follows from (2.3) that for
any 1-periodic, bounded, measurable function f and $T > 0$,

$$\int_0^T f(X_u)\,du = \int_0^1 f(x) L^\circ_T(x; X)\,dx. \tag{2.5}$$

Exploiting the periodicity of b and B and introducing the corresponding periodized
version $\chi^\circ_T(\cdot; X) = \sum_{k \in \mathbb{Z}} \chi_T(\cdot + k; X)$, we can then rewrite (2.4) as

$$\Phi_T(b; X) = \frac12 \int_0^1 \Big(L^\circ_T(x; X)\big(b^2(x) + b'(x)\big) - 2\chi^\circ_T(x; X) b(x)\Big)\,dx. \tag{2.6}$$

We remark that under our assumptions, the random function $x \mapsto L^\circ_T(x; X)$
is a.s. continuous. This follows from the fact that for the diffusion X given by
(2.1) the local time L is continuous and there are only finitely many non-zero terms
in the sum defining $L^\circ$, almost surely. In particular, we have that the norms
$\|L^\circ_T(\cdot; X)\|_\infty$ and $\|L^\circ_T(\cdot; X)\|_{L^2}$ are a.s. finite.
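
To make the periodic local time concrete, here is a minimal numerical sketch. It assumes a path discretised by the Euler-Maruyama scheme and approximates $L^\circ_T$ by an occupation histogram of the path folded onto $[0, 1)$; the drift $b(x) = \sin(2\pi x)$, the step size, the bin count and the function names are illustrative choices of ours, not taken from the paper. With $f = 1$, identity (2.5) says the periodic local time integrates to $T$, which the histogram reproduces exactly.

```python
import numpy as np


def simulate_path(b, T, dt, rng):
    """Euler-Maruyama discretisation of dX_t = b(X_t) dt + dW_t with X_0 = 0."""
    n = int(round(T / dt))
    x = np.empty(n + 1)
    x[0] = 0.0
    noise = rng.normal(0.0, np.sqrt(dt), size=n)
    for i in range(n):
        x[i + 1] = x[i] + b(x[i]) * dt + noise[i]
    return x


def periodic_local_time(path, dt, n_bins):
    """Histogram estimate of the periodic local time: time spent in a bin / bin width."""
    wrapped = np.mod(path[:-1], 1.0)  # fold the path onto the circle
    counts, edges = np.histogram(wrapped, bins=n_bins, range=(0.0, 1.0))
    width = 1.0 / n_bins
    centres = 0.5 * (edges[:-1] + edges[1:])
    return centres, counts * dt / width


rng = np.random.default_rng(0)
b = lambda x: np.sin(2 * np.pi * x)  # illustrative 1-periodic, mean-zero drift
T, dt = 50.0, 1e-3
path = simulate_path(b, T, dt, rng)
x, L = periodic_local_time(path, dt, n_bins=50)

# (2.5) with f = 1: the periodic local time integrates to T.
print(np.sum(L) / 50, T)
```

The histogram is only a crude stand-in for the continuous function $x \mapsto L^\circ_T(x; X)$; finer bins require longer or more finely sampled paths.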



2.2 The prior 

We will assume that we observe a solution of the SDE (2.1) up to time $T > 0$, for
some $b \in \dot C^1(\mathbb{T})$. To make inference on b we endow it with a centered Gaussian
prior $\Pi$. We will view the prior as a centered Gaussian measure on $L^2(\mathbb{T})$ and
define it through its covariance operator $C_0$, or, rather, through its precision
operator $C_0^{-1}$. Specifically, we fix hyperparameters $\eta, \kappa > 0$ and $p \in \{2, 3, \ldots\}$
and consider the operator $C_0$ with densely defined inverse

$$C_0^{-1} = \eta\big((-\Delta)^p + \kappa I\big), \tag{2.7}$$

where $\Delta$ is the one-dimensional Laplacian, I is the identity and the domain of $C_0^{-1}$
is given by $D(C_0^{-1}) = \dot H^{2p}(\mathbb{T})$, the space of mean-zero functions in the Sobolev
space $H^{2p}(\mathbb{T})$ of functions in $L^2(\mathbb{T})$ with 2p square integrable weak derivatives.

To see that $C_0$ is indeed a valid covariance operator and hence the prior is
well defined, consider the orthonormal basis $\phi_k$ of $\dot L^2(\mathbb{T})$, which is by definition
the space of mean-zero functions in $L^2(\mathbb{T})$, given by

$$\phi_{2k}(x) = \sqrt2 \cos(2\pi k x), \qquad \phi_{2k-1}(x) = \sqrt2 \sin(2\pi k x),$$

for $k \in \mathbb{N}$. The functions $\phi_k$ belong to the domain $\dot H^{2p}(\mathbb{T})$ of the operator (2.7)
and

$$C_0^{-1} \phi_k = \eta\Big(\big(4\pi^2 \lceil k/2 \rceil^2\big)^p + \kappa\Big)\phi_k$$

for $k \in \mathbb{N}$. It follows that $C_0$ is the operator on $\dot L^2(\mathbb{T})$ which is diagonalized by
the basis $\phi_k$, with eigenvalues

$$\lambda_k = \Big(\eta\big(4\pi^2 \lceil k/2 \rceil^2\big)^p + \eta\kappa\Big)^{-1}. \tag{2.8}$$

Thus $C_0$ is positive definite, symmetric, and trace-class and hence a covariance
operator on $\dot L^2(\mathbb{T})$. It extends to a covariance operator on the whole space $L^2(\mathbb{T})$
by setting $C_0 1 = 0$.

The integer p in (2.7) controls the regularity of the prior $\Pi$ and we assume
$p \ge 2$ to ensure that the drift is $C^1$ (see below). The parameter $\eta > 0$ sets an
overall scale for the precision. The parameter $\kappa$ allows us to shift the precisions
in every mode by a uniform amount. We employ $\kappa > 0$ as it simplifies some
of the analysis, but $\kappa = 0$ could be included in the analysis with further work.
Likewise we have assumed a mean zero prior, but extensions to include a mean
could be made.

The preceding calculations show that the prior $\Pi$ is the law of the centered
Gaussian process $W = (W(x) : x \in \mathbb{T})$ defined by

$$W(x) = \sum_{k \in \mathbb{N}} \sqrt{\lambda_k}\,\phi_k(x) Z_k, \tag{2.9}$$

for $Z_1, Z_2, \ldots$ independent, standard Gaussian random variables. Note that
$\sqrt{\lambda_k} \asymp k^{-p}$ asymptotically. Using also the differential relations between the
basis functions $\phi_k$ it is straightforward to see that the process W has $p - 1 \ge 1$
weak derivatives in the $L^2$-sense. Moreover, using Kolmogorov's classical
continuity theorem it can be shown that this $(p - 1)$st derivative has a version
with sample paths that are Hölder continuous of order $\gamma$ for every $\gamma < 1/2$.
Combining this we see that W has a version with $\alpha$-Hölder sample paths, for
every $\alpha < p - 1/2$. In particular, it holds that all the mass of the prior $\Pi$ is
concentrated on $\dot C^1(\mathbb{T})$.

Note that the Karhunen-Loève expansion (2.9) shows that the reproducing
kernel Hilbert space (RKHS) of the prior is given by $\mathbb{H} = \{\sum_{k \ge 1} c_k \phi_k :
\sum c_k^2/\lambda_k < \infty\}$. Since $1/\lambda_k \asymp k^{2p}$, this implies that $\mathbb{H} = \dot H^p(\mathbb{T})$. This shows
once again that the $L^2$-support of $\Pi$ is $\dot L^2(\mathbb{T})$. We can thus view $\Pi$ as a Gaussian
measure on any of the separable Banach spaces $\dot L^2(\mathbb{T})$, $\dot C(\mathbb{T})$, $\dot C^k(\mathbb{T})$ or $\dot H^k(\mathbb{T})$,
for $k \le p - 1$.
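
The series representation (2.9) suggests a direct way to draw approximate samples from the prior. The sketch below truncates the series after `n_modes` terms, which is our simplification; the function names and the hyperparameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def prior_eigenvalues(n_modes, eta, kappa, p):
    """Eigenvalues lambda_k of C_0 as in (2.8), for k = 1, ..., n_modes."""
    k = np.arange(1, n_modes + 1)
    freq = np.ceil(k / 2)  # phi_{2j-1} and phi_{2j} share the frequency j
    return 1.0 / (eta * (4 * np.pi**2 * freq**2) ** p + eta * kappa)


def basis(k, x):
    """phi_{2j}(x) = sqrt(2) cos(2 pi j x), phi_{2j-1}(x) = sqrt(2) sin(2 pi j x)."""
    j = (k + 1) // 2
    trig = np.cos if k % 2 == 0 else np.sin
    return np.sqrt(2.0) * trig(2 * np.pi * j * x)


def sample_prior(x, n_modes, eta, kappa, p, rng):
    """One draw of the series (2.9), truncated after n_modes terms, on the grid x."""
    lam = prior_eigenvalues(n_modes, eta, kappa, p)
    z = rng.standard_normal(n_modes)
    return sum(np.sqrt(lam[k - 1]) * z[k - 1] * basis(k, x)
               for k in range(1, n_modes + 1))


x = np.linspace(0.0, 1.0, 200, endpoint=False)
draw = sample_prior(x, 40, eta=0.02, kappa=1.0, p=2, rng=np.random.default_rng(1))
print(draw[:5])
```

Because $\sqrt{\lambda_k}$ decays like $k^{-p}$, the truncation error is small already for moderate `n_modes`, and every draw has mean zero over the circle by construction.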
3 Posterior distribution



3.1 Bayes' formula 

If we endow $\dot C^1(\mathbb{T})$ with its Hölder norm and $C[0,T]$ with the uniform norm,
then expression (2.2) shows that the negative log-likelihood $(b, x) \mapsto \Phi_T(b; x)$
(has a version that) is Borel-measurable as a map from $\dot C^1(\mathbb{T}) \times C[0,T] \to \mathbb{R}$.
Since we can view $\Pi$ as a measure on $\dot C^1(\mathbb{T})$, it follows that we have a well-defined
Borel measure $\Pi(db)\,P^T_b(dx)$ on $\dot C^1(\mathbb{T}) \times C[0,T]$, which is the joint law
of b and $X^T$ in the Bayesian setup

$$X^T \mid b \sim P^T_b.$$

The posterior distribution, i.e. the conditional distribution of b given $X^T$, is
then well-defined as well and given by

$$\Pi(B \mid X^T) = \frac1Z \int_B \exp\big(-\Phi_T(b; X)\big)\,\Pi(db), \qquad Z = \int \exp\big(-\Phi_T(b; X)\big)\,\Pi(db), \tag{3.1}$$



for Borel sets $B \subset \dot C^1(\mathbb{T})$, provided that $Z > 0$ a.s., cf. Lemma 5.3 of [15].
To see that the latter condition is fulfilled, observe that since $\Pi$ is a centered
Gaussian distribution on the separable Banach space $\dot C^1(\mathbb{T})$, endowed with its
Hölder norm, we have $\Pi(b : \|b\|_\infty + \|b'\|_\infty < \infty) = 1$. It follows from (2.6) that

$$|\Phi_T(b; X)| \lesssim \big(1 + \|L^\circ_T(\cdot; X)\|_\infty\big)\big(\|b\|_\infty^2 + \|b'\|_\infty\big).$$

(Here, and elsewhere, $a \lesssim b$ means that a is less than an irrelevant constant
times b.) Together this gives the a.s. positivity of Z, since $\|L^\circ_T(\cdot; X)\|_\infty < \infty$
a.s.

We have now defined the posterior as a measure on $\dot C^1(\mathbb{T})$, but since the
prior is in fact a probability measure on $C^\alpha(\mathbb{T})$ for every $\alpha < p - 1/2$ (see the
preceding section), it is a Borel measure on these Hölder spaces as well. We can
of course also view it as a measure on $C(\mathbb{T})$ or $L^2(\mathbb{T})$.

3.2 Formal computation of the posterior 

The next goal is to characterize the posterior. We proceed first strictly formally
and non-rigorously. Very loosely speaking, we have that the prior $\Pi$ has a
"density" proportional to

$$b \mapsto \exp\Big(-\frac12 \int_0^1 b(x)\, C_0^{-1} b(x)\,dx\Big) \tag{3.2}$$



and the negative log-likelihood also has a quadratic form, given by (2.6). This
suggests that the posterior is again Gaussian. Formally completing the square
gives the relations

$$C_T^{-1} = C_0^{-1} + L^\circ_T(\cdot; X) I, \tag{3.3}$$
$$C_T^{-1} \hat b_T = \frac12 \big(L^\circ_T(\cdot; X)\big)' + \chi^\circ_T(\cdot; X) \tag{3.4}$$

for the posterior mean $\hat b_T$ and the posterior precision operator $C_T^{-1}$.

As detailed in the preceding section we assume that the prior covariance
operator is given by (2.7), with integer $p \ge 2$, $\eta, \kappa > 0$, $\Delta$ the one-dimensional
Laplacian and $D(C_0^{-1}) = \dot H^{2p}(\mathbb{T})$. In that case (3.3) gives

$$C_T^{-1} = \eta(-\Delta)^p + \big(\eta\kappa + L^\circ_T(\cdot; X)\big) I \tag{3.5}$$

and $D(C_T^{-1}) = \dot H^{2p}(\mathbb{T})$. By standard PDE theory, the equation $C_T^{-1} f = g$ has
a unique weak solution in $\dot H^{2p}(\mathbb{T})$ for every $g \in \dot L^2(\mathbb{T})$, see e.g. [12] or [28] for
more on the periodic case, hence $C_T$ is well defined on all of $L^2(\mathbb{T})$. Moreover,
$C_T$ is a bounded operator, since $C_T^{-1}$ is coercive.

Of course the ordinary derivative of local time is not defined and we have to
interpret (3.4) in a weak sense. In order to enable us to do this, in Section 3.4 we
consider the variational formulation of equation (3.4). As a precursor to this,
the next subsection is devoted to observing that the differential equation for the
mean arises as the Euler-Lagrange equation for a certain variational problem,
yielding an interesting connection with penalized least-squares estimation.

3.3 Connection with penalized least squares 

Here we demonstrate the fact that the posterior mean $\hat b_T$ given by (3.4) can be
viewed as a penalized least-squares estimator in the case $p = 2$. Formally, the
SDE (2.1) can be written as

$$\dot X_t = b(X_t) + \dot W_t,$$

where the dot denotes differentiation with respect to t (obviously, the derivatives
$\dot X$ and $\dot W$ do not exist in the ordinary sense). This is just a continuous-time
version of a standard nonparametric regression model and for a drift function
u, we can view the integral

$$\int_0^T \big(\dot X_t - u(X_t)\big)^2\,dt$$

as a residual sum of squares. A penalized least-squares procedure consists in
adding a penalty term to this quantity and minimizing the resulting criterion
over u. Expanding the square in the preceding integral shows that this is equivalent
to minimizing

$$u \mapsto -\int_0^T u(X_t)\,dX_t + \frac12 \int_0^T u^2(X_t)\,dt + P(u),$$
over an appropriate space of functions, where P(u) is the penalty.

If the function u is smooth and periodic, with primitive U, then by Itô's formula and the
definitions of $L^\circ$ and $\chi^\circ$, we have

$$\int_0^T u(X_t)\,dX_t = U(X_T) - U(X_0) - \frac12 \int_0^T u'(X_t)\,dt = \int_0^1 u(x)\chi^\circ_T(x; X)\,dx - \frac12 \int_0^1 L^\circ_T(x; X) u'(x)\,dx,$$

$$\int_0^T u^2(X_t)\,dt = \int_0^1 u^2(x) L^\circ_T(x; X)\,dx.$$

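Hence, if the functions u over which the minimization takes place are smooth
enough, the criterion can also be written as

$$u \mapsto \int_0^1 \Big(\frac12 u^2(x) L^\circ_T(x; X) + \frac12 u'(x) L^\circ_T(x; X) - u(x)\chi^\circ_T(x; X)\Big)\,dx + P(u).$$

Now consider a Sobolev-type penalty term of the form

$$P(u) = \frac12 \eta \Big(\kappa \int_0^1 (u(x))^2\,dx + \int_0^1 (u''(x))^2\,dx\Big),$$

for constants $\eta, \kappa > 0$. Then the objective functional $u \mapsto \Lambda(u; X)$ takes the
form

$$\Lambda(u; X) = \int_0^1 \Big(\frac12 u^2\big(\eta\kappa + L^\circ_T(X)\big) + \frac12 u' L^\circ_T(X) - u\chi^\circ_T(X) + \frac12 \eta (u'')^2\Big)\,dx,$$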

where we omitted explicit dependence on x to lighten notation. To minimise this
functional, simply take its variational derivative in the direction v, i.e. compute
the limit $\lim_{\epsilon \to 0} (\Lambda(u + \epsilon v; X) - \Lambda(u; X))/\epsilon$, for a smooth test function v:

$$\frac{\delta\Lambda}{\delta u}(v) = \int_0^1 \Big(u v L^\circ_T(X) - \frac12 v (L^\circ_T)'(X) - v\chi^\circ_T(X) + \eta v'' u'' + \eta\kappa u v\Big)\,dx.$$



Integrating by parts (where the boundary terms vanish due to periodicity) now
yields the form

$$\frac{\delta\Lambda}{\delta u}(v) = \int_0^1 v \Big(u L^\circ_T(X) - \frac12 (L^\circ_T)'(X) - \chi^\circ_T(X) + \eta u'''' + \eta\kappa u\Big)\,dx,$$

from which it is evident that equating the variational derivative to zero for all
smooth test functions yields exactly the posterior mean obtained in (3.4) for the
case $p = 2$:

$$\eta u'''' + \big(\eta\kappa + L^\circ_T(\cdot; X)\big) u = \frac12 \big(L^\circ_T(\cdot; X)\big)' + \chi^\circ_T(\cdot; X).$$

In the context of inverse problems, adding the square of the norm of the underlying
vector space is known as (generalized) Tikhonov regularization, and
the connection to Bayesian inference with a Gaussian prior is well established
in general, see [31]. It may be viewed as a natural extension of the approach
of Wahba [35] from regression to the diffusion process setting. The case of
regularization through higher order derivatives in the penalization term P is
similar.

3.4 Weak variational formulation for the posterior mean 

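In the preceding section we remarked that the RKHS of the Gaussian prior
equals the Sobolev space $\dot H^p(\mathbb{T})$. Below we prove that the posterior is a.s. a
Gaussian measure. Moreover, since the denominator Z in (3.1) is positive a.s.,
the posterior is equivalent to the prior. It follows that the posterior mean $\hat b_T$ is
a.s. an element of $\dot H^p(\mathbb{T})$. By saying it is a weak solution to (3.4) we mean that
it solves the following weak form of the associated variational principle:

$$a(\hat b_T, v; X) = r(v; X) \quad \text{for every } v \in \dot H^p(\mathbb{T}), \tag{3.6}$$

where the bilinear form $a(\cdot, \cdot; X) : \dot H^p(\mathbb{T}) \times \dot H^p(\mathbb{T}) \to \mathbb{R}$ and the linear form
$r(\cdot; X) : \dot H^p(\mathbb{T}) \to \mathbb{R}$ are defined by

$$a(u, v; X) = \eta \int_0^1 u^{(p)}(x) v^{(p)}(x)\,dx + \eta\kappa \int_0^1 u(x) v(x)\,dx + \int_0^1 u(x) v(x) L^\circ_T(x; X)\,dx,$$

$$r(v; X) = -\frac12 \int_0^1 L^\circ_T(x; X) v'(x)\,dx + \int_0^1 \chi^\circ_T(x; X) v(x)\,dx.$$

The following lemma records the essential properties of a and r and the associated
variational problem.

Lemma 3.1. The following statements hold almost surely:

(i) $a(\cdot, \cdot; X)$ is bilinear, symmetric, continuous and coercive:

$$a(v_1, v_2; X) \le \big(\eta + \eta\kappa + \|L^\circ_T(\cdot; X)\|_\infty\big) \|v_1\|_{H^p} \|v_2\|_{H^p}$$

for $v_1, v_2 \in \dot H^p(\mathbb{T})$ and for some constant $c > 0$, $a(v, v; X) \ge c\|v\|_{H^p}^2$ for
all $v \in \dot H^p(\mathbb{T})$.

(ii) $r(\cdot; X)$ is linear and bounded:

$$|r(v; X)| \le \Big(\frac12 \|L^\circ_T(\cdot; X)\|_{L^2} + \|\chi^\circ_T(\cdot; X)\|_{L^2}\Big) \|v\|_{H^p}$$

for all $v \in \dot H^p(\mathbb{T})$.

(iii) There exists a unique $u \in \dot H^p(\mathbb{T})$ such that $a(u, v; X) = r(v; X)$ for all
$v \in \dot H^p(\mathbb{T})$.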
Proof. (i) Bilinearity, symmetry and continuity follow straightforwardly from
the definition of a. Coercivity follows easily from the positivity of $\eta$ and $\kappa$ and
the Poincaré inequality. (ii) Again, straightforward. (iii) Follows from (i) and
(ii) by the Lax-Milgram Lemma, see [Q]. □

3.5 Characterization of the posterior 

We can now prove that the posterior is Gaussian and characterize its mean and
covariance operator. Recall that by saying that $\hat b_T$ is a weak solution of the
differential equation (3.4) we mean that it solves the variational problem (3.6).

Theorem 3.2. Almost surely, the posterior $\Pi(\cdot \mid X^T)$ is a Gaussian measure
on $L^2(\mathbb{T})$. Its covariance operator $C_T$ is given by (3.5) and its mean $\hat b_T$ is the
unique weak solution of (3.4).

Proof. For $n \in \mathbb{N}$, let $P_n : \dot L^2(\mathbb{T}) \to L^2(\mathbb{T})$ be the orthogonal projection onto
the linear span $V_n$ of the first n basis functions $\phi_1, \ldots, \phi_n$. Let the random
measure $\Pi_n(\cdot \mid X^T)$ be given by

$$\Pi_n(B \mid X^T) = \frac{1}{Z_n} \int_B \exp\big(-\Phi_T(P_n b; X)\big)\,\Pi(db), \qquad Z_n = \int \exp\big(-\Phi_T(P_n b; X)\big)\,\Pi(db),$$

for Borel sets $B \subset \dot C^1(\mathbb{T})$. The fact that this random measure is well defined
can be argued exactly as in Section 3.1.

For $b \in \dot C^1(\mathbb{T})$ it holds that $P_n b \to b$ in $H^1(\mathbb{T})$ as $n \to \infty$. It is easily seen
from (2.6) that the random map $b \mapsto \Phi_T(b; X)$ is a.s. $H^1(\mathbb{T})$-continuous. It
follows that a.s., $b \mapsto \Phi_T(P_n b; X)$ converges pointwise to $\Phi_T(\cdot; X)$ on $\dot C^1(\mathbb{T})$.
By Lemma 3.3 below, there exists for every $\epsilon > 0$ a constant $K(\epsilon)$ such that

$$-\Phi_T(b; X) \le \epsilon \|b\|_{H^1}^2 + K(\epsilon)\big(1 + \|L^\circ_T(\cdot; X)\|_{L^2}^2\big),$$

and hence, since $\|P_n b\|_{H^1} \le \|b\|_{H^1}$,

$$e^{-\Phi_T(P_n b; X)} \le e^{K(\epsilon)(1 + \|L^\circ_T(\cdot; X)\|_{L^2}^2)}\, e^{\epsilon \|b\|_{H^1}^2}.$$

Since $\Pi$ can be viewed as a Gaussian measure on $H^1(\mathbb{T})$, Fernique's theorem
implies that a.s., the right-hand side of the last display is a $\Pi$-integrable function
of b for $\epsilon > 0$ small enough (see [4], Theorem 2.8.5). Hence, by dominated
convergence, we can conclude that $Z_n \to Z$ almost surely. The same reasoning
shows that for every Borel set $B \subset \dot C^1(\mathbb{T})$, it a.s. holds that

$$\int_B \exp\big(-\Phi_T(P_n b; X)\big)\,\Pi(db) \to \int_B \exp\big(-\Phi_T(b; X)\big)\,\Pi(db)$$

as $n \to \infty$. Hence, we have that with probability 1, the measures $\Pi_n(\cdot \mid X^T)$
converge weakly to the posterior $\Pi(\cdot \mid X^T)$. Note that the weak convergence
takes place in $\dot C^1(\mathbb{T})$, but then in $L^2(\mathbb{T})$ as well. Since the measures $\Pi_n(\cdot \mid X^T)$
are easily seen to be Gaussian, the measure $\Pi(\cdot \mid X^T)$ must be Gaussian as well.

If we view $\dot L^2(\mathbb{T})$ as the product of $V_n$ and $V_n^\perp$, then by construction the measure
$\Pi_n(\cdot \mid X^T)$ is a product of Gaussian measures on $V_n$ and $V_n^\perp$. The measure
on $V_n$ really has density proportional to (3.2) on $V_n$ (relative to the pushforward
measure of the Lebesgue measure on $\mathbb{R}^n$ under the map $(c_1, \ldots, c_n) \mapsto \sum_k c_k \phi_k$).
The formal arguments given in Section 3.2 can therefore be made rigorous,
showing that this factor is a Gaussian measure on $V_n$ with covariance operator
$P_n C_T P_n$ and mean $b_n \in V_n$ which solves the variational problem

$$a(b_n, v; X) = r(v; X)$$

for every $v \in V_n$. The measure on $V_n^\perp$ has mean zero, so $b_n$ is in fact the mean
of the whole measure $\Pi_n(\cdot \mid X^T)$. The covariance operator of the measure on
$V_n^\perp$ is given by $(I - P_n) C_0 (I - P_n)$.

Next we prove that the posterior mean $\hat b_T$ is the weak solution of (3.4). By
Lemma 3.1 there a.s. exists a unique $u \in \dot H^p(\mathbb{T})$ such that $a(u, v; X) = r(v; X)$
for all $v \in \dot H^p(\mathbb{T})$. Standard Galerkin method arguments show that for the
mean of $\Pi_n(\cdot \mid X^T)$ we have $b_n \to u$ in $\dot H^p(\mathbb{T})$. Indeed, let $e_n = u - b_n$. Then
we have the orthogonality property $a(e_n, v; X) = 0$ for all $v \in V_n$. Using the
continuity and coercivity of $a(\cdot, \cdot; X)$, cf. Lemma 3.1, it follows that for $v \in V_n$

$$c\|e_n\|_{H^p}^2 \le a(e_n, e_n; X) = a(e_n, u - v; X) \le \big(\eta + \eta\kappa + \|L^\circ_T(\cdot; X)\|_\infty\big) \|e_n\|_{H^p} \|u - v\|_{H^p}.$$

Hence, for every $v \in V_n$, we have

$$c\|e_n\|_{H^p} \le \big(\eta + \eta\kappa + \|L^\circ_T(\cdot; X)\|_\infty\big) \|u - v\|_{H^p}.$$

By taking $v = P_n u$ we then see that $b_n \to u$ in $\dot H^p(\mathbb{T})$. On the other hand, by
the weak convergence found above, $b_n$ converges a.s. to the posterior mean $\hat b_T$
in $L^2(\mathbb{T})$ (see [4], Example 3.8.15). We conclude that $\hat b_T$ a.s. equals the unique
weak solution u of (3.4), as required.

It remains to show that the covariance operator of the posterior is given
by (3.5). Let $\Sigma_n = P_n C_T P_n + (I - P_n) C_0 (I - P_n)$ be the covariance operator
of $\Pi_n(\cdot \mid X^T)$ and let $\Sigma$ be the covariance operator of the posterior $\Pi(\cdot \mid X^T)$.
Since the measures converge weakly and are Gaussian, we have that for every
$f \in L^2(\mathbb{T})$, $\Sigma_n f \to \Sigma f$ in $L^2(\mathbb{T})$ (cf. Example 3.8.15 of [4] again). On the other
hand, for $n \ge k$ and $g \in L^2(\mathbb{T})$ we have

$$\big|\langle g, \Sigma_n \phi_k \rangle_{L^2} - \langle g, C_T \phi_k \rangle_{L^2}\big| = \big|\langle g, (P_n - I) C_T \phi_k \rangle_{L^2}\big| \le \|(P_n - I) g\|_{L^2} \|C_T \phi_k\|_{L^2} \to 0,$$

hence $\Sigma_n \phi_k$ converges weakly to $C_T \phi_k$. It follows that $\Sigma \phi_k = C_T \phi_k$ for every k
and the proof is complete. □
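
The Galerkin approximations $b_n$ used in the proof can also be computed numerically. The following is a minimal sketch under stated assumptions, not the authors' implementation: the path is generated by an Euler-Maruyama scheme with the illustrative drift $\sin(2\pi x)$, the local-time term of $a$ is evaluated through (2.5) as a Riemann sum along the path, and $r(\phi_k; X)$ is evaluated as an Itô sum for $\int_0^T \phi_k(X_t)\,dX_t$, using the identity from Section 3.3. The names `basis_matrix` and `galerkin_posterior_mean` and all numerical values are hypothetical.

```python
import numpy as np


def basis_matrix(x, n_modes):
    """Columns phi_1, ..., phi_n evaluated at the points x (ordering as in Section 2.2)."""
    cols = []
    for k in range(1, n_modes + 1):
        j = (k + 1) // 2
        trig = np.cos if k % 2 == 0 else np.sin
        cols.append(np.sqrt(2.0) * trig(2 * np.pi * j * x))
    return np.column_stack(cols)


def galerkin_posterior_mean(path, dt, n_modes, eta, kappa, p):
    """Coefficients of b_n in V_n solving a(b_n, v; X) = r(v; X) for all v in V_n."""
    k = np.arange(1, n_modes + 1)
    # Prior part of a(.,.): diagonal in the basis, with entries 1 / lambda_k.
    prior_precision = eta * (4 * np.pi**2 * np.ceil(k / 2) ** 2) ** p + eta * kappa
    Phi = basis_matrix(path[:-1], n_modes)  # phi_k is periodic, no wrapping needed
    M = Phi.T @ Phi * dt                    # Riemann sums for int_0^T phi_j phi_k (X_t) dt
    r = Phi.T @ np.diff(path)               # Ito sums for int_0^T phi_k(X_t) dX_t
    A = np.diag(prior_precision) + M
    return np.linalg.solve(A, r), A


# Illustrative data: Euler-Maruyama path with drift sin(2 pi x), X_0 = 0.
rng = np.random.default_rng(0)
T, dt = 200.0, 1e-2
n = int(round(T / dt))
path = np.zeros(n + 1)
noise = rng.normal(0.0, np.sqrt(dt), size=n)
for i in range(n):
    path[i + 1] = path[i] + np.sin(2 * np.pi * path[i]) * dt + noise[i]

coef, A = galerkin_posterior_mean(path, dt, n_modes=10, eta=0.02, kappa=1.0, p=2)
grid = np.linspace(0.0, 1.0, 101)
b_hat = basis_matrix(grid, 10) @ coef       # Galerkin posterior mean on a grid
print(coef[:4])
```

The matrix `A` is symmetric and positive definite, mirroring the symmetry and coercivity of $a$ in Lemma 3.1, so the linear system has a unique solution for every truncation level.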
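Lemma 3.3. For every $\epsilon > 0$ there exists a constant $K(\epsilon) > 0$ such that

$$-\Phi_T(b; X) \le \epsilon \|b\|_{H^1}^2 + K(\epsilon)\big(1 + \|L^\circ_T(\cdot; X)\|_{L^2}^2\big).$$

Proof. It follows from (2.6) that

$$-\Phi_T(b; X) \le \frac12 \int_0^1 L^\circ_T(x; X) |b'(x)|\,dx + \int_0^1 |b(x)|\,dx.$$

Now note that for every $\beta > 0$ and $f, g \in L^2(\mathbb{T})$, it holds that $2\langle f, g \rangle \le
\beta\|f\|^2 + \beta^{-1}\|g\|^2$ ("Young's inequality with $\epsilon$"). Applying this to both integrals
on the right we get

$$-\Phi_T(b; X) \le \frac{\beta}{4} \|L^\circ_T(\cdot; X)\|_{L^2}^2 + \frac{1}{4\beta} \|b'\|_{L^2}^2 + \frac{\beta}{2} + \frac{1}{2\beta} \|b\|_{L^2}^2 \le \epsilon \|b\|_{H^1}^2 + \frac{\beta}{2} + \frac{\beta}{4} \|L^\circ_T(\cdot; X)\|_{L^2}^2,$$

where $\epsilon = (2\beta)^{-1}$, so $K(\epsilon) = \beta/2$. □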

4 Asymptotic behaviour of the local time 

In the next section we will investigate the asymptotic behaviour of the posterior,
using the characterization provided by Theorem 3.2. Since (3.3) and (3.4) involve
the periodic local time $L^\circ_T(\cdot; X)$, the asymptotic properties of that random
function play a key role.

The results we establish in this section can be seen as complementing and
extending the work of Bolthausen [5] in which it is proved that if X is Brownian
motion (i.e. $b = 0$ in (2.1)), then the random functions

$$x \mapsto \sqrt T \Big(\frac1T L^\circ_T(x; X) - 1\Big)$$

converge weakly in the space $C(\mathbb{T})$ to a Gaussian random map as $T \to \infty$. For
our purposes we need weak convergence (or at least asymptotic tightness) in the
Sobolev space $H^\alpha(\mathbb{T})$, for $\alpha < 1/2$, and we need the result not just for Brownian
motion, but for general periodic, zero-mean drift functions b. Moreover, we need
the associated uniform law of large numbers which states that

$$x \mapsto \frac1T L^\circ_T(x; X)$$

converges uniformly as $T \to \infty$. Similar statements were obtained for ergodic
diffusions in the papers [36] and [34]. In the present periodic setting however,
completely different arguments are necessary.

Given $b \in \dot C(\mathbb{T})$ we define the probability density $\rho$ on $[0, 1]$ by

$$\rho(x) = C \exp\Big(2 \int_0^x b(y)\,dy\Big), \qquad x \in [0, 1], \tag{4.1}$$

where $C > 0$ is the normalization constant that ensures that $\rho$ integrates to 1.
Note that since b has mean zero, $\rho$ satisfies $\rho(0) = \rho(1)$ and enjoys a natural
extension to a periodic function.
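
For a concrete drift the density (4.1) is easy to evaluate numerically. The sketch below uses the trapezoidal rule on a uniform grid and the illustrative drift $b(x) = \sin(2\pi x)$; the function name `invariant_density` and the grid size are our own choices. It can be used to check the properties just stated: $\rho$ integrates to 1, $\rho(0) = \rho(1)$, and $\rho' = 2b\rho$.

```python
import numpy as np


def invariant_density(b, n_grid=2001):
    """rho(x) = C exp(2 int_0^x b(y) dy) on [0, 1], normalised to integrate to 1."""
    x = np.linspace(0.0, 1.0, n_grid)
    bx = b(x)
    dx = x[1] - x[0]
    # Cumulative trapezoidal rule for the primitive B(x) = int_0^x b(y) dy.
    B = np.concatenate(([0.0], np.cumsum(0.5 * (bx[1:] + bx[:-1]) * dx)))
    unnorm = np.exp(2.0 * B)
    mass = np.sum(0.5 * (unnorm[1:] + unnorm[:-1]) * dx)
    return x, unnorm / mass


x, rho = invariant_density(lambda y: np.sin(2 * np.pi * y))
print(rho[0], rho[-1])  # equal up to rounding, since b has mean zero
```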

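Theorem 4.1.

(i) It almost surely holds that

$$\sup_{x \in \mathbb{T}} \Big|\frac1T L^\circ_T(x; X) - \rho(x)\Big| \to 0$$

as $T \to \infty$.

(ii) For every $\alpha < 1/2$, the random maps

$$x \mapsto \sqrt T \Big(\frac1T L^\circ_T(x; X) - \rho(x)\Big)$$

are asymptotically tight in $H^\alpha(\mathbb{T})$ as $T \to \infty$. In particular, for every
$\alpha < 1/2$,

$$\Big\|\frac1T L^\circ_T(\cdot; X) - \rho\Big\|_{H^\alpha} = O_P\Big(\frac{1}{\sqrt T}\Big)$$

as $T \to \infty$.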



The proof of the theorem is long and is therefore deferred to the final Section 6, in order to keep the overarching arguments of this paper, which are aimed at proving posterior consistency, to the fore. In the following section about posterior contraction rates we need the fact that $X_T = O_P(\sqrt{T})$ for the diffusions with periodic drift under consideration. As we are not aware of an existing reference for this statement, we derive it from the preceding theorem.



Corollary 4.2. For every $b \in C(\mathbb{T})$, the weak solution $X = (X_t : t \ge 0)$ of the SDE (2.1) satisfies $X_T = O_P(\sqrt{T})$ as $T \to \infty$.



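Proof. We have

$$X_T = \int_0^T b(X_s)\,ds + W_T$$

for a standard Brownian motion $W$. Since $b$ is 1-periodic, the integral can be rewritten in terms of the periodic local time $L^\circ$. Moreover, (4.1) implies that $\rho$ is 1-periodic as well and $\rho' = 2b\rho$, hence

$$\int_0^1 b(x)\rho(x)\,dx = \tfrac12\big(\rho(1) - \rho(0)\big) = 0.$$

It follows that

$$|X_T| \le T\,\Big|\int_0^1 b(x)\Big(\frac{L^\circ_T(x; X)}{T} - \rho(x)\Big)dx\Big| + |W_T| \le T\,\|b\|_{L^2}\,\Big\|\frac{L^\circ_T(\cdot; X)}{T} - \rho\Big\|_{L^2} + |W_T|.$$

By the preceding theorem, this is $O_P(\sqrt{T})$. □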
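The two statements above can be illustrated numerically. The sketch below simulates one Euler-Maruyama path of $dX_t = b(X_t)\,dt + dW_t$, uses a histogram of $X_t \bmod 1$ as a binned stand-in for $L^\circ_T(\cdot; X)/T$, and compares it with $\rho$ from (4.1). The drift $b(x) = \sin(2\pi x)$, the step size, the horizon, the number of bins and the seed are all assumptions chosen for illustration, and the discretisation is only an approximation of the continuous-time objects in the theorem.

```python
import math
import numpy as np

def drift(x):
    # Hypothetical 1-periodic, mean-zero drift chosen for illustration.
    return math.sin(2.0 * math.pi * x)

def rho(x):
    """Invariant density (4.1) for the drift above, normalised numerically."""
    grid = np.linspace(0.0, 1.0, 4001)
    # For b(y) = sin(2*pi*y): 2 * int_0^x b(y) dy = (1 - cos(2*pi*x)) / pi.
    dens = np.exp((1.0 - np.cos(2.0 * np.pi * grid)) / np.pi)
    mass = np.sum(0.5 * (dens[1:] + dens[:-1])) * (grid[1] - grid[0])
    return np.exp((1.0 - np.cos(2.0 * np.pi * np.asarray(x))) / np.pi) / mass

def simulate(T, dt, seed):
    """Euler-Maruyama path of dX = b(X) dt + dW started at 0."""
    rng = np.random.default_rng(seed)
    n = int(round(T / dt))
    noise = (math.sqrt(dt) * rng.standard_normal(n)).tolist()
    path = [0.0] * n
    x = 0.0
    for i in range(n):
        x += drift(x) * dt + noise[i]
        path[i] = x
    return np.array(path)

def occupation_density(path, n_bins=20):
    """Histogram of X_t mod 1: a binned approximation of L_T^circ(.; X) / T."""
    hist, edges = np.histogram(np.mod(path, 1.0), bins=n_bins,
                               range=(0.0, 1.0), density=True)
    centres = 0.5 * (edges[1:] + edges[:-1])
    return centres, hist

T = 2000.0
path = simulate(T=T, dt=2e-3, seed=0)
centres, hist = occupation_density(path)
sup_error = float(np.max(np.abs(hist - rho(centres))))
scaled_endpoint = abs(path[-1]) / math.sqrt(T)
print(f"sup-norm error: {sup_error:.3f}, |X_T|/sqrt(T): {scaled_endpoint:.3f}")
```

Rerunning with larger `T` should shrink the sup-norm error roughly like $T^{-1/2}$, in line with Theorem 4.1, while $|X_T|/\sqrt{T}$ stays of order one, in line with Corollary 4.2.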
5 Posterior contraction rates

In this section we use the characterization of the posterior provided by Theorem 3.2 and the asymptotic behaviour of the local time established in Theorem 4.1 to study the rate at which the posterior contracts around the true drift function, which we denote by $b_0$ to emphasize that the results are frequentist in nature.

The first theorem concerns the rate of convergence of the posterior mean $\hat b_T$, which, by Theorem 3.2, is the unique weak solution in $H^p(\mathbb{T})$ of the differential equation (3.4).

Theorem 5.1. Suppose that the true drift function $b_0 \in H^p(\mathbb{T})$. Then for every $\delta > 0$,

$$\|\hat b_T - b_0\|_{L^2} = O_{P_{b_0}}\big(T^{-\frac{p - 1/2}{2p} + \delta}\big)$$

as $T \to \infty$.

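Proof. By Theorem 3.2 we have (in weak sense)

$$\big(C_0^{-1} + L^\circ_T(\cdot; X)\big)\hat b_T = \tfrac12\big(L^\circ_T(\cdot; X)\big)' + \chi^\circ_T(\cdot; X).$$

Note that it follows from (4.1) that the invariant density $\rho$ satisfies $\rho' = 2b_0\rho$ if $b_0$ is the drift function, hence, with $G_T = \sqrt{T}\big(L^\circ_T(\cdot; X)/T - \rho\big)$,

$$\big(C_0^{-1} + L^\circ_T(\cdot; X)\big)b_0 = \tfrac12\big(L^\circ_T(\cdot; X)\big)' + \chi^\circ_T(\cdot; X) + C_0^{-1}b_0 + \sqrt{T}\,G_T b_0 - \tfrac12\sqrt{T}\,G_T' - \chi^\circ_T(\cdot; X).$$

Subtracting the two equations shows that $e = \hat b_T - b_0$ satisfies (still in the weak sense)

$$\big(C_0^{-1} + L^\circ_T(\cdot; X)\big)e = -C_0^{-1}b_0 - \sqrt{T}\,G_T b_0 + \tfrac12\sqrt{T}\,G_T' + \chi^\circ_T(\cdot; X).$$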

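Since $\rho$ is bounded away from zero (see (4.1)) it follows from statement (i) of Theorem 4.1 that there exists a constant $c > 0$ such that

$$\inf_{x \in \mathbb{T}}\frac{L^\circ_T(x; X)}{T} \ge c$$

on an event $A_T$ with $P_{b_0}(A_T) \to 1$. As a consequence, testing the weak differential equation for the error $e$ with the test function $e$ itself ("energy method") yields, on the event $A_T$, the inequality

$$\begin{aligned}
\|C_0^{-1/2}e\|^2_{L^2} + cT\|e\|^2_{L^2}
&\le |\langle C_0^{-1}b_0, e\rangle| + |\langle \chi^\circ_T(\cdot; X), e\rangle| + \sqrt{T}\,|\langle G_T b_0, e\rangle| + \tfrac12\sqrt{T}\,|\langle G_T, e'\rangle| \\
&= |\langle C_0^{-1/2}b_0, C_0^{-1/2}e\rangle| + |\langle \chi^\circ_T(\cdot; X), e\rangle| + \sqrt{T}\,|\langle G_T b_0, e\rangle| + \tfrac12\sqrt{T}\,|\langle G_T, e'\rangle|.
\end{aligned}$$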
Using Young's inequality $2\langle f, g\rangle \le \kappa\|f\|^2_{L^2} + \kappa^{-1}\|g\|^2_{L^2}$ on the first three terms on the right with appropriate $\kappa$'s and subtracting the resulting terms involving $T\|e\|^2_{L^2}$ from both sides we get, still on $A_T$,

$$\|C_0^{-1/2}e\|^2_{L^2} + cT\|e\|^2_{L^2} \le \|C_0^{-1/2}b_0\|^2_{L^2} + \frac{\|\chi^\circ_T(\cdot; X)\|^2_{L^2}}{T} + \|b_0\|^2_{L^\infty}\|G_T\|^2_{L^2} + \tfrac12\sqrt{T}\,|\langle G_T, e'\rangle|. \tag{5.1}$$

We note that the first three terms on the right are now stochastically bounded: the first one is constant, the second is bounded by a multiple of $|X_T - X_0|^2/T$, which is $O_P(1)$ according to Corollary 4.2, and the third one is $O_P(1)$ by Theorem 4.1.

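For the last term on the right we have, since the norm $\|C_0^{-s/(2p)}\,\cdot\,\|_{L^2}$ is equivalent to the $H^s$-norm and $C_0^{1/(2p)}\frac{d}{dx}$ is bounded,

$$|\langle G_T, e'\rangle| = \Big|\Big\langle C_0^{-s/(2p)}G_T,\; C_0^{s/(2p)}\frac{d}{dx}e\Big\rangle\Big| \lesssim \|G_T\|_{H^s}\,\|C_0^{-(1-s)/(2p)}e\|_{L^2}.$$

We now use the interpolation inequality given as Theorem 13 on p. 149 in [9],

$$\|A^{\iota}u\| \le \|Au\|^{\iota}\,\|u\|^{1-\iota},$$

which is valid for $\iota \in (0, 1)$ and positive, coercive, self-adjoint, densely defined operators $A$. We take $A = C_0^{-1/2}$ and $\iota = (1 - s)/p$ and, combining with what we had above, we get

$$\sqrt{T}\,|\langle G_T, e'\rangle| \lesssim T^{\frac{1-s}{2p}}\,\|G_T\|_{H^s}\,\|C_0^{-1/2}e\|_{L^2}^{\frac{1-s}{p}}\,\big(\sqrt{T}\,\|e\|_{L^2}\big)^{\frac{p-1+s}{p}}.$$

Using Young's inequality for products we have the further bound

$$\sqrt{T}\,|\langle G_T, e'\rangle| \lesssim \kappa\,T^{\frac{1-s}{p}}\,\|G_T\|^2_{H^s} + \kappa^{-1}\|C_0^{-1/2}e\|^2_{L^2} + \kappa^{-1}T\|e\|^2_{L^2}.$$

If we combine this with (5.1), choose $\kappa$ large enough and subtract $\kappa^{-1}\|C_0^{-1/2}e\|^2_{L^2} + \kappa^{-1}T\|e\|^2_{L^2}$ from both sides of the inequality we arrive at the bound

$$T\|e\|^2_{L^2} \lesssim O_P(1) + T^{\frac{1-s}{p}}\,\|G_T\|^2_{H^s},$$

which holds on the event $A_T$. In view of Theorem 4.1 and since $s \in (0, 1/2)$ is arbitrary, this completes the proof. □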



Theorem 5.1 only concerns the posterior mean, but we can in fact show that the whole posterior distribution contracts around the true $b_0$ at the same rate. As usual, we say that the posterior contracts around $b_0$ at the rate $\varepsilon_T$ (relative to the $L^2$-norm) if for arbitrary positive numbers $M_T \to \infty$,

$$E_{b_0}\Pi\big(b : \|b - b_0\|_{L^2} \ge M_T\varepsilon_T \mid X^T\big) \to 0$$

as $T \to \infty$. This essentially says that for large $T$, the posterior mass is concentrated in $L^2$-balls around $b_0$ with a radius of the order $\varepsilon_T$.
Theorem 5.2. Suppose that $b_0 \in H^p(\mathbb{T})$. Then for every $\delta > 0$, the posterior contracts around $b_0$ at the rate $T^{-\frac{p - 1/2}{2p} + \delta}$ as $T \to \infty$.



Proof. Set $\varepsilon_T = T^{-\frac{p - 1/2}{2p} + \delta}$. By the triangle inequality,

$$E_{b_0}\Pi\big(b : \|b - b_0\|_{L^2} \ge M_T\varepsilon_T \mid X^T\big) \le E_{b_0}\Pi\big(b : \|b - \hat b_T\|_{L^2} \ge M_T\varepsilon_T/2 \mid X^T\big) + P_{b_0}\big(\|\hat b_T - b_0\|_{L^2} \ge M_T\varepsilon_T/2\big).$$

By Theorem 5.1 the second term on the right vanishes as $T \to \infty$, hence it suffices to show that $\Pi(b : \|b - \hat b_T\|_{L^2} \ge M_T\varepsilon_T/2 \mid X^T)$ converges to 0 in $P_{b_0}$-probability. By Markov's inequality, this quantity is bounded by

$$\frac{4}{M_T^2\varepsilon_T^2}\int \|\hat b_T - b\|^2_{L^2}\,\Pi(db \mid X^T).$$

Since the integral is equal to the trace of the covariance operator of the centered posterior, it suffices to show that $\operatorname{tr}(C_T) = O_P(\varepsilon_T^2)$.

As before, let $\lambda_i$ and $\phi_i$ be the eigenvalues and eigenfunctions of the prior covariance operator $C_0$. For every $N \in \mathbb{N}$ we have

$$\operatorname{tr}(C_T) = \sum_{i \le N}\langle \phi_i, C_T\phi_i\rangle + \sum_{i > N}\langle \phi_i, C_T\phi_i\rangle.$$

To bound the second sum on the right we note that in view of (3.3) we have $C_T^{-1} \ge C_0^{-1}$ and it is not difficult to see that as a consequence, $C_T \le C_0$. Hence, since $\lambda_i \asymp i^{-2p}$, the second sum is bounded by a constant times $N^{1-2p}$. By Cauchy-Schwarz the first sum is bounded by $\sum_{i \le N}\|C_T\phi_i\|_{L^2}$. To further bound this, we observe that

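$$\begin{aligned}
\inf_{x \in [0,1]} L^\circ_T(x; X)\,\|C_T\phi_i\|^2_{L^2}
&\le \int_0^1 \big(C_T\phi_i(x)\big)^2 L^\circ_T(x; X)\,dx \\
&\le \int_0^1 C_T\phi_i(x)\,\big(C_0^{-1} + L^\circ_T(\cdot; X)\big)C_T\phi_i(x)\,dx \\
&= \int_0^1 \phi_i(x)\,C_T\phi_i(x)\,dx \\
&\le \|\phi_i\|_{L^2}\|C_T\phi_i\|_{L^2} = \|C_T\phi_i\|_{L^2}.
\end{aligned}$$

Dividing by $\|C_T\phi_i\|_{L^2}$ shows that $\|C_T\phi_i\|_{L^2} \le 1/\inf_{x \in [0,1]} L^\circ_T(x; X)$ and hence, by the first statement of Theorem 4.1, $\|C_T\phi_i\|_{L^2} = O_P(1/T)$.

Combining what we have we see that $\operatorname{tr}(C_T) \lesssim N\,O_P(1/T) + N^{1-2p}$ for every $N \in \mathbb{N}$. The choice $N \asymp T^{1/(2p)}$ balances the two terms and shows that $\operatorname{tr}(C_T) = O_P(T^{(1-2p)/(2p)}) = O_P(\varepsilon_T^2)$. □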
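The balancing step at the end of the proof can be checked numerically. The sketch below minimises the deterministic proxy $N/T + N^{1-2p}$ over integers $N$ for a range of $T$ and fits log-log slopes; all constants are set to 1 and the value `p = 2` is a hypothetical choice, so this only illustrates the exponents $1/(2p)$ and $(1-2p)/(2p)$, not the stochastic bound itself.

```python
import numpy as np

def best_trace_bound(T, p):
    """Minimise N / T + N**(1 - 2p) over integers N >= 1 (all constants set to 1)."""
    N = np.arange(1, int(10 * T ** (1.0 / (2 * p))) + 2, dtype=float)
    values = N / T + N ** (1.0 - 2.0 * p)
    k = int(np.argmin(values))
    return int(N[k]), float(values[k])

p = 2  # hypothetical prior regularity, used only for illustration
Ts = np.logspace(4, 8, 9)
results = [best_trace_bound(T, p) for T in Ts]
best_N = np.array([r[0] for r in results], dtype=float)
best_val = np.array([r[1] for r in results])

# Fitted log-log slopes in T for the minimiser and for the minimal value.
slope_N = np.polyfit(np.log(Ts), np.log(best_N), 1)[0]
slope_val = np.polyfit(np.log(Ts), np.log(best_val), 1)[0]
print("slope of optimal N:", slope_N, " expected:", 1.0 / (2 * p))
print("slope of bound:    ", slope_val, " expected:", (1.0 - 2 * p) / (2 * p))
```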



Remarks 5.3. It is clear from the proof of Theorem 5.2 that the posterior spread $\int \|\hat b_T - b\|^2_{L^2}\,\Pi(db \mid X^T)$ is always of the order $T^{-(p - 1/2)/p}$, regardless of the smoothness of the true drift function $b_0$. Hence if the rate result of Theorem 5.1 for the posterior mean can be improved, for instance if the condition that $b_0 \in H^p(\mathbb{T})$ can be relaxed to the assumption $b_0 \in H^{p - 1/2}(\mathbb{T})$ (see the discussion in the introduction), or if the $\delta$ can be removed from the rate, then the result of Theorem 5.2 for the full posterior automatically improves as well.

We also note that the proof of Theorem 5.1 delivers convergence rates in other norms. In particular it yields

$$\|\hat b_T - b_0\|_{H^p} = O_{P_{b_0}}\big(T^{\frac{1}{4p} + \delta}\big)$$

and hence, by interpolation, we have that the error in the mean converges to zero as

$$\|\hat b_T - b_0\|_{H^s} = O_{P_{b_0}}\big(T^{-\frac{p - 1/2 - s}{2p} + \delta}\big)$$

for $0 \le s < p - 1/2$.

6 Proof of Theorem 4.1

6.1 Semimartingale versus diffusion local time 

Throughout this section, the drift function $b \in C(\mathbb{T})$ is fixed. The weak solution $X$ of the SDE (2.1) is a regular diffusion on $\mathbb{R}$ with scale function $s$ given by

$$s(x) = \int_{x_0}^{x} \exp\Big(-2\int_{y_0}^{y} b(z)\,dz\Big)\,dy.$$

We choose $x_0$ and $y_0$ such that $s(0) = 0$ and $s(1) = 1$. The speed measure $m$ has Lebesgue density $1/s'$. Since $b$ is 1-periodic and mean-zero the function $s'$ is 1-periodic as well. It follows that $m$ is 1-periodic and that $s$ satisfies

$$s(x + k) = s(x) + k, \tag{6.1}$$

for all $x \in \mathbb{R}$ and $k \in \mathbb{Z}$.

The periodic local time $L^\circ$ was defined through the semimartingale local time $L$ of the diffusion $X$, for which we have the occupation times formula (2.3). The diffusion $X$ also has continuous local time relative to its speed measure, the so-called diffusion local time of $X$. We denote this random field by $(\ell_t(x) : t \ge 0, x \in \mathbb{R})$. It holds that $t \mapsto \ell_t(x)$ is continuous and for every $t \ge 0$ and bounded, measurable function $f$,

$$\int_0^t f(X_u)\,du = \int_{\mathbb{R}} f(x)\,\ell_t(x)\,m(dx) \tag{6.2}$$

(see for instance [10]). For this local time we define a periodic version $\ell^\circ$ as well, by setting

$$\ell^\circ_t(x) = \sum_{k \in \mathbb{Z}} \ell_t(x + k).$$
The periodicity of $m$ then implies that for every 1-periodic, bounded, measurable function $f$,

$$\int_0^t f(X_u)\,du = \int_0^1 f(x)\,\ell^\circ_t(x)\,m(dx). \tag{6.3}$$

Comparing this with (2.5) we see that we have the relation $s'(x)L^\circ_T(x; X) = \ell^\circ_T(x)$ for every $T \ge 0$ and $x \in [0, 1]$. Now note that $1/s'$ is up to a constant equal to the invariant density $\rho$ defined by (4.1). Since $\rho$ is a probability density on $[0, 1]$ and $1/s'$ is the density of the speed measure $m$, we have $m[0, 1]\rho = 1/s'$ on $[0, 1]$. Therefore, statement (i) of Theorem 4.1 is equivalent to the statement that

$$\sup_{x \in [0,1]}\Big|\frac{1}{t}\ell^\circ_t(x) - \frac{1}{m[0, 1]}\Big| \to 0 \tag{6.4}$$

a.s. as $t \to \infty$, and statement (ii) is equivalent to the asymptotic tightness of

$$x \mapsto \sqrt{t}\Big(\frac{1}{t}\ell^\circ_t(x) - \frac{1}{m[0, 1]}\Big) \tag{6.5}$$

in $H^\alpha(\mathbb{T})$ for every $\alpha \in [0, 1/2)$. We will prove these statements in the subsequent sections.



6.2 A representation of the local time up to winding times 

If we do not write it explicitly, we work under the measure $P_0$, i.e. $X$ is started in 0. We define a sequence of $P_0$-a.s. finite stopping times $\tau_0, \tau_1, \ldots$ by setting $\tau_0 = 0$, $\tau_1$ is the first time $X$ exits $[-1, 1]$, $\tau_2$ is the first time after $\tau_1$ that $X$ exits $[X_{\tau_1} - 1, X_{\tau_1} + 1]$, etc. (Note that if we define a process $Z$ on the complex unit circle by $Z_t = \exp(2i\pi X_t)$, then $\tau_k$ is the time that the process $Z$ completes its $k$th winding of the circle.)

The following theorem gives a representation for the periodic local time of X 
up till the nth winding time. The representation involves a stochastic integral 
relative to s{X). The process s{X) is a diffusion in natural scale, hence a 
time-changed Brownian motion, and hence a continuous local martingale. 

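Theorem 6.1. For $x \in (0, 1)$,

$$\frac{1}{n}\ell^\circ_{\tau_n}(x) - 1 = \frac{1}{n}\sum_{k=1}^{n} U_k(x),$$

where $U_1, \ldots, U_n$ are i.i.d. continuous random functions, distributed as

$$U(x) = \ell_{\tau_1}(x) + \ell_{\tau_1}(x - 1) - 1 \tag{6.6}$$

$$\phantom{U(x)} = X_{\tau_1}\big(1 - 2s(x)\big) + 2\int_0^{\tau_1}\phi_x(X_u)\,ds(X_u), \tag{6.7}$$

where $\phi_x = 1_{(x-1,\infty)} - 1_{(-\infty,x]}$.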
Proof. For $k \in \mathbb{N}$ we write $X^k = (X_{\tau_{k-1}+t} - X_{\tau_{k-1}} : t \ge 0)$ and $\tau^k = \inf\{t : |X^k_t| = 1\}$. By Lemma 6.2 ahead, the processes $(X^k_t : t \in [0, \tau^k])$ are independent and have the same distribution as $(X_t : t \in [0, \tau_1])$. It follows that for $x \in (0, 1)$, with $\ell^Z$ denoting the diffusion local time of the diffusion $Z$,

$$\frac{1}{n}\ell^\circ_{\tau_n}(x) - 1 = \frac{1}{n}\sum_{k=1}^{n} U_k(x), \tag{6.8}$$

where

$$U_k(x) = \ell^{X^k}_{\tau^k}(x) + \ell^{X^k}_{\tau^k}(x - 1) - 1,$$

and the $U_k$ are independent copies of the random function $U$ defined by (6.6).

Now let $Y = s(X)$. Then $Y$ is a regular diffusion in natural scale (i.e. the identity function is its scale function) and the speed measure $m^Y$ of $Y$ is related to the speed measure $m$ of $X$ by $m = m^Y \circ s$. It is easily seen that for the local time $\ell^Y$ of $Y$ relative to its speed measure $m^Y$, we have $\ell_t(x) = \ell^Y_t(s(x))$. For diffusions in natural scale, the diffusion local time coincides with
the semimartingale local time (see [5^, Section V.49). In particular, the Tanaka- 
Meyer formula holds: 

$$\ell^Y_t(x) = |Y_t - x| - |x| - \int_0^t \operatorname{sign}(Y_u - x)\,dY_u \tag{6.9}$$

under $P_0$. In view of (6.1), $\tau_1$ is also the first time that $Y$ exits $[-1, 1]$, so we have that $X_{\tau_1} = Y_{\tau_1}$. Using also the fact that the scale function $s$ is strictly increasing, we obtain

$$\ell_{\tau_1}(x) = |X_{\tau_1} - s(x)| - |s(x)| - \int_0^{\tau_1}\operatorname{sign}(X_u - x)\,ds(X_u). \tag{6.10}$$

Together with (6.1) this implies that (6.7) holds. □

The proof of the theorem uses the following lemma, which implies that $X$ "starts afresh" after every winding time $\tau_k$. Let $(\mathcal{F}_t : t \ge 0)$ denote the natural filtration of the process $X$.

Lemma 6.2. For every $P_0$-a.s. finite stopping time $\tau$ such that $X_\tau \in \mathbb{Z}$ a.s., it holds that the process $(X_{\tau+t} - X_\tau : t \ge 0)$ is independent of $\mathcal{F}_\tau$ and has the same law as $X$ under $P_0$.



Proof. Fix a measurable subset $C \subset C[0, \infty)$. By the strong Markov property we have

$$P_0\big(X_{\tau+\cdot} - X_\tau \in C \mid \mathcal{F}_\tau\big) = f(X_\tau)$$

a.s., where $f(x) = P_x(X - X_0 \in C)$. The periodicity of the drift function implies that for every $k \in \mathbb{Z}$, $f(k) = P_k(X - k \in C) = P_0(X \in C)$. Hence we have

$$P_0\big(X_{\tau+\cdot} - X_\tau \in C \mid \mathcal{F}_\tau\big) = P_0(X \in C),$$

a.s., which completes the proof. □
Since we will be interested in the local time up till a deterministic time $t$, it is necessary to deal with the time interval between $t$ and the previous or next winding time. The following lemma will be used for that. For $t \ge 0$, let the $\mathbb{Z}_+$-valued random variable $n_t$ be such that $\tau_{n_t}$ is the last winding time less than or equal to $t$, so $\tau_{n_t} \le t < \tau_{n_t+1}$.

Lemma 6.3. For all $t \ge 0$ and Borel sets $B \subset C[0, 1]$,

$$P_0\big(\ell^\circ_{\tau_{n_t+1}} - \ell^\circ_t \in B\big) = E_0 P_{X_t - X_{\tau_{n_t}}}\big(\ell^\circ_{\tau_1} \in B\big).$$



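Proof. We split up the event of interest according to the position of $X$ at time $\tau_{n_t}$. For $k \in \mathbb{Z}$ we have

$$P_0\big(\ell^\circ_{\tau_{n_t+1}} - \ell^\circ_t \in B;\, X_{\tau_{n_t}} = k\big) = P_0\big(\ell^\circ_{\sigma_{t,k}} - \ell^\circ_t \in B;\, X_{\tau_{n_t}} = k\big),$$

where $\sigma_{t,k} = \inf\{s \ge t : |X_s - k| \ge 1\}$. Let $(\mathcal{F}_s : s \ge 0)$ be the natural filtration of the process $X$. Since $X_{\tau_{n_t}}$ is $\mathcal{F}_t$-measurable, conditioning on $\mathcal{F}_t$ gives

$$P_0\big(\ell^\circ_{\tau_{n_t+1}} - \ell^\circ_t \in B;\, X_{\tau_{n_t}} = k\big) = E_0 1_{\{X_{\tau_{n_t}} = k\}} P_0\big(\ell^\circ_{\sigma_{t,k}} - \ell^\circ_t \in B \mid \mathcal{F}_t\big).$$

By the Markov property, the conditional probability equals $P_{X_t}(\ell^\circ_{\sigma_{0,k}} \in B)$. By the periodicity of the drift function, this is equal to $P_{X_t - k}(\ell^\circ_{\sigma_{0,0}} \in B)$. Since $\sigma_{0,0} = \tau_1$, we obtain

$$P_0\big(\ell^\circ_{\tau_{n_t+1}} - \ell^\circ_t \in B;\, X_{\tau_{n_t}} = k\big) = E_0 1_{\{X_{\tau_{n_t}} = k\}} P_{X_t - k}\big(\ell^\circ_{\tau_1} \in B\big).$$

Summation over $k$ completes the proof. □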



6.3 Proof of statement (i) of Theorem 4.1

In this section we prove that (6.4) holds a.s. for $t \to \infty$, which is equivalent to statement (i) of Theorem 4.1.

According to Theorem 6.1 we have

$$\frac{1}{n}\ell^\circ_{\tau_n}(x) - 1 = \frac{1}{n}\sum_{k=1}^{n} U_k(x),$$

where the $U_k$ are independent copies of the continuous random function $U$ on $[0, 1]$ given by (6.6). Now $E\|U\|_\infty \le 1 + 2E\sup_{|x| \le 1}\ell_{\tau_1}(x)$. To bound the expectation, we again use the fact that $\ell_{\tau_1}(x) = \ell^Y_{\tau_1}(s(x))$, for $Y = s(X)$. Relation (6.1) implies that $\sup_{|x| \le 1}\ell_{\tau_1}(x) = \sup_{|x| \le 1}\ell^Y_{\tau_1}(x)$. Applying the BDG-type inequality for local times to the stopped continuous local martingale $Y^{\tau_1}$ (see [27], Theorem XI.(2.4)) we then see that for some constant $C > 0$,

$$E\|U\|_\infty \le 1 + 2E\sup_{|x| \le 1}\ell^Y_{\tau_1}(x) \le 1 + C\,E\sup_{t \le \tau_1}|Y_t| < \infty.$$
Since by (6.1) it holds that $X_{\tau_1} = \pm 1$ with equal probability, it is easily derived from (6.7) that $EU(x) = 0$. By the Banach space version of Kolmogorov's law of large numbers (see [33], Corollary 7.10), it follows that

$$\sup_{x \in [0,1]}\Big|\frac{1}{n}\ell^\circ_{\tau_n}(x) - 1\Big| \to 0 \tag{6.11}$$

a.s.



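The random variables $\tau_1, \tau_2 - \tau_1, \tau_3 - \tau_2, \ldots$ are i.i.d., so by the law of large numbers, $\tau_n/n \to E\tau_1$ a.s. Applying relation (6.2) with $t = \tau_1$ and $f \equiv 1$ we see that

$$E\tau_1 = \int_{-1}^{1} E\ell_{\tau_1}(x)\,m(dx).$$

Since $X_{\tau_1} = \pm 1$ with equal probability, (6.10) implies that $E\ell_{\tau_1}(x) = 1 - |s(x)|$. Using (6.1) and the periodicity of $m$, it follows that

$$\begin{aligned}
E\tau_1 &= \int_{-1}^{0}\big(1 + s(x)\big)\,m(dx) + \int_0^1\big(1 - s(x)\big)\,m(dx) \\
&= \int_0^1\big(1 + s(x - 1)\big)\,m(dx) + \int_0^1\big(1 - s(x)\big)\,m(dx) = m[0, 1].
\end{aligned}$$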
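The identity $E\tau_1 = \int_{-1}^{1}(1 - |s(x)|)\,m(dx) = m[0, 1]$ can be sanity-checked by quadrature. The sketch below builds $s$, $s'$ and the speed density $1/s'$ on $[-1, 1]$ for the hypothetical drift $b(z) = \sin(2\pi z)$ (an assumption made only for this illustration) and compares the two sides numerically.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule, written out explicitly."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def cumulative_trapezoid(y, x):
    """Running trapezoidal integral, starting from 0 at x[0]."""
    return np.concatenate(([0.0], np.cumsum(0.5 * (y[1:] + y[:-1]) * np.diff(x))))

def drift_primitive(x):
    # B(x) = int_0^x b(z) dz for the hypothetical drift b(z) = sin(2*pi*z).
    return (1.0 - np.cos(2.0 * np.pi * x)) / (2.0 * np.pi)

n_half = 2000
x = np.linspace(-1.0, 1.0, 2 * n_half + 1)   # node n_half corresponds to x = 0
s_prime = np.exp(-2.0 * drift_primitive(x))  # s' is proportional to exp(-2B)
s = cumulative_trapezoid(s_prime, x)
s = s - s[n_half]                            # enforce s(0) = 0
scale = s[-1]                                # enforce s(1) = 1
s, s_prime = s / scale, s_prime / scale
m_density = 1.0 / s_prime                    # Lebesgue density of the speed measure

# m[0, 1] versus the integral of (1 - |s|) against m over [-1, 1].
m_unit = trapezoid(m_density[n_half:], x[n_half:])
expected_tau1 = trapezoid((1.0 - np.abs(s)) * m_density, x)
print("m[0,1] =", m_unit, " integral of (1 - |s|) dm =", expected_tau1)
```

The check also confirms $s(-1) = -1$ numerically, which is relation (6.1) with $x = 0$ and $k = -1$.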



Combining (6.11) with the fact that $\tau_n/n \to m[0, 1]$ a.s., we find that

$$\sup_{x \in [0,1]}\Big|\frac{1}{\tau_n}\ell^\circ_{\tau_n}(x) - \frac{1}{m[0, 1]}\Big| \to 0 \tag{6.12}$$

a.s.



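Now let $n_t$ be defined as before Lemma 6.3, so that $\tau_{n_t} \le t < \tau_{n_t+1}$. Then as $t \to \infty$ it holds that $n_t \to \infty$ and hence $\tau_{n_t}/n_t \to m[0, 1]$ a.s. and $\tau_{n_t+1}/n_t \to m[0, 1]$ a.s. It follows that $n_t/t \to 1/m[0, 1]$ a.s., and therefore also $\tau_{n_t}/t \to 1$ a.s. We can write

$$\frac{1}{t}\ell^\circ_t(x) = \frac{\tau_{n_t}}{t}\,\frac{1}{\tau_{n_t}}\ell^\circ_{\tau_{n_t}}(x) + \frac{1}{t}\big(\ell^\circ_t(x) - \ell^\circ_{\tau_{n_t}}(x)\big).$$

Relation (6.12) shows that a.s., the first term on the right converges uniformly to $1/m[0, 1]$. The second term is nonnegative and bounded by

$$\frac{1}{t}\big(\ell^\circ_{\tau_{n_t+1}}(x) - \ell^\circ_{\tau_{n_t}}(x)\big) = \frac{\tau_{n_t+1}}{t}\,\frac{1}{\tau_{n_t+1}}\ell^\circ_{\tau_{n_t+1}}(x) - \frac{\tau_{n_t}}{t}\,\frac{1}{\tau_{n_t}}\ell^\circ_{\tau_{n_t}}(x),$$

which converges uniformly to 0 by the preceding. This completes the proof of (6.4) and hence of statement (i) of Theorem 4.1.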



6.4 Proof of statement (ii) of Theorem 4.1

In this final subsection we prove that the random maps (6.5) are asymptotically tight in the space $H^\alpha(\mathbb{T})$ for every $\alpha \in [0, 1/2)$, which is equivalent to statement (ii) of Theorem 4.1. It is most convenient and of course not restrictive to work with the complex Sobolev spaces. Let $e_k(x) = \exp(i2k\pi x)$, $k \in \mathbb{Z}$, be the standard complex exponential basis of $L^2[0, 1]$. For $\alpha \ge 0$, define the associated Sobolev space

$$H^\alpha[0, 1] = \Big\{f \in L^2[0, 1] : \|f\|^2_{H^\alpha} = \sum_{k \in \mathbb{Z}} |k|^{2\alpha}\,|\langle f, e_k\rangle|^2 < \infty\Big\},$$

where $\langle f, g\rangle = \int_0^1 f(x)\overline{g(x)}\,dx$ is the usual inner product on $L^2[0, 1]$.

By the representation of the local time given by Theorem 6.1 and the central limit theorem for Hilbert space-valued random elements (e.g. [23], Corollary 10.9), we have that

$$\sqrt{n}\Big(\frac{1}{n}\ell^\circ_{\tau_n} - 1\Big) \tag{6.13}$$

converges weakly in $H^\alpha[0, 1]$ if

1. $E\|U\|^2_{H^\alpha} < \infty$.

2. $EU = 0$ (where the expectation is to be interpreted as a Pettis integral).

We will show that these conditions hold if (and only if) $\alpha < 1/2$. Slightly abusing notation, denote the two functions on the right of (6.7) by $U_1$ and $U_2$. We will show that conditions 1-2 hold for $U_1$ and $U_2$ separately.

To show that the conditions hold for $U_1$, recall that $X_{\tau_1} = \pm 1$ with equal probability. Hence, $EU_1 = 0$ and $E\|U_1\|^2_{H^\alpha} = \|1 - 2s\|^2_{H^\alpha} < \infty$.

As for $U_2$, using (6.7) and the stochastic Fubini theorem it is readily checked that

$$\langle U_2, e_k\rangle = 2\int_0^{\tau_1} c_k(X_u)\,ds(X_u),$$



where 



i2k-n 



i2k-K 

J2k{x + 1)TT 



i2k7i 



Ckix) = < 



i2k'K 

^i2k'iTX 

i2k'K 

^i2k'jT 

i2k'K 



if a; + 1 < 0, 
ifa;<0<x-|-l<l, 
ifO<a;<l<a; + l, 
if x > 1. 



To show that condition 1. holds for $U_2$, note that for $u \le \tau_1$ it holds that $|X_u| \le 1$. It is straightforward to see that for $|x| \le 1$, we have $|c_k(x)| \le C(1 + |k|)^{-1}$ for some $C > 0$. Therefore, by the Itô isometry,

$$\sum_{k}|k|^{2\alpha}\,E\Big|\int_0^{\tau_1} c_k(X_u)\,ds(X_u)\Big|^2 = \sum_{k}|k|^{2\alpha}\,E\int_0^{\tau_1}|c_k(X_u)|^2\,d\langle s(X)\rangle_u \le C^2\,E\langle s(X)\rangle_{\tau_1}\sum_{k}\frac{|k|^{2\alpha}}{(1 + |k|)^2}.$$

The sum on the right is finite if $\alpha \in [0, 1/2)$. For the diffusion $Y = s(X)$ the diffusion local time coincides with the semimartingale local time, hence



The Tanaka-Meyer formula and optional stopping imply that for $|x| \le 1$,

$$E\ell^Y_{\tau_1}(x) = E|Y_{\tau_1} - x| - |x| = 1 - |x|.$$

Hence, $U_2$ satisfies condition 1. Finally, note that to show that $EU_2 = 0$, it suffices to show that $EU_2(x) = 0$ for every fixed $x \in (0, 1)$. But this follows readily from (6.6) again, by optional stopping. So indeed the random maps (6.13) converge in $H^\alpha[0, 1]$ for every $\alpha \in [0, 1/2)$.
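The role of the threshold $\alpha < 1/2$ in condition 1 can be seen from the series $\sum_k |k|^{2\alpha}(1 + |k|)^{-2}$ that appears in the bound for $U_2$. The following sketch compares how much the partial sums still grow between two large cut-offs for a few values of $\alpha$; the cut-offs $10^3$ and $10^6$ and the chosen values of $\alpha$ are arbitrary illustrative assumptions.

```python
import numpy as np

def partial_sum(alpha, K):
    """Sum of |k|^(2*alpha) / (1 + |k|)^2 over 1 <= |k| <= K (both signs of k)."""
    k = np.arange(1, K + 1, dtype=float)
    return 2.0 * float(np.sum(k ** (2.0 * alpha) / (1.0 + k) ** 2))

# Growth of the partial sums between the cut-offs K = 10^3 and K = 10^6.
growth = {}
for alpha in (0.0, 0.25, 0.5):
    growth[alpha] = partial_sum(alpha, 10**6) - partial_sum(alpha, 10**3)
    print(f"alpha = {alpha}: growth of partial sums = {growth[alpha]:.4f}")
```

For $\alpha < 1/2$ the growth is small and shrinks further with larger cut-offs, whereas at $\alpha = 1/2$ the terms behave like $1/|k|$ and the partial sums keep growing logarithmically.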

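To complete the proof we consider the decomposition

$$\sqrt{t}\Big(\frac{1}{t}\ell^\circ_t - \frac{1}{m[0, 1]}\Big) = \sqrt{\frac{n_t + 1}{t}}\,\sqrt{n_t + 1}\Big(\frac{1}{n_t + 1}\ell^\circ_{\tau_{n_t+1}} - 1\Big) + \sqrt{t}\Big(\frac{n_t + 1}{t} - \frac{1}{m[0, 1]}\Big) - \frac{1}{\sqrt{t}}\big(\ell^\circ_{\tau_{n_t+1}} - \ell^\circ_t\big).$$

Since $n_t/t \to 1/m[0, 1]$ a.s., the tightness of the maps (6.13) implies that the first term is asymptotically tight. By the central limit theorem, $\sqrt{n}(\tau_n/n - m[0, 1])$ converges in distribution. Together with the inequality $\tau_{n_t} \le t < \tau_{n_t+1}$ and the delta method this implies that the second term is asymptotically tight as well. For the last term, note that by Lemma 6.3 we have, for $M > 0$,

$$P_0\Big(\Big\|\frac{1}{\sqrt{t}}\big(\ell^\circ_{\tau_{n_t+1}} - \ell^\circ_t\big)\Big\|_{H^\alpha} > M\Big) \le \sup_{|a| \le 1} P_a\big(\|\ell^\circ_{\tau_1}\|_{H^\alpha} > \sqrt{t}\,M\big) \le \frac{1}{M^2 t}\sup_{|a| \le 1} E_a\|\ell^\circ_{\tau_1}\|^2_{H^\alpha}.$$

Similar considerations as used to show that condition 1. above holds for $U_2$ show that the supremum over $a$ on the right-hand side is bounded. We conclude that the last term in the decomposition is $o_P(1)$. This completes the proof.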